Abstract- The aim of this research is to separate the
foreground and background in natural images. Object
separation is performed by analysing the boundary area
between foreground and background, i.e. the unknown
region. The analysis of the unknown region is used to
determine the threshold value that separates definitive
foreground and background in alpha matting. The process
begins by defining sub-images of each grayscale image in
the dataset using a Region of Interest (ROI). The features
of each sub-image, consisting of contrast, correlation,
energy and entropy, are then extracted using the Grey Level
Co-Occurrence Matrix (GLCM) at angles of 0°, 45°, 90° and
135°. The local extraction results are averaged and
normalized, and the result is treated as the threshold for
alpha matting. The result is evaluated using Peak
Signal-to-Noise Ratio (PSNR) and shows a significant
increase in performance.

Keywords: alpha matting, threshold, region of interest and 
GLCM. 

I. INTRODUCTION 

In recent decades, research on object extraction has
been carried out extensively, as its accuracy greatly
affects the quality of image and video editing. Object
extraction for image and video editing is also very
important in multimedia applications because of the rapid
growth of the networked multimedia industry. Porter and
Duff [1] introduced the alpha channel as a means of
controlling the linear interpolation of foreground and
background colours for anti-aliasing purposes where the
foreground joins the background. In object extraction,
this technique is called "pulling a matte" or "digital
matting".

"Pulling a matte" is performed by combining the
semi-transparent colours of the foreground with the
background to produce a new blended colour. The matte
value of a pixel ranges from full black to full white: a
pixel belongs to the foreground if its matte value is full
white and to the background if it is full black. Mixed
colours are a weighted average of the foreground and
background colours. The accuracy with which foreground and
background are separated along the object boundary
determines the success of the process.

A good matte extraction result should have an even
colour distribution, neither too white nor too black. A
pixel that is too white is dominated by the extracted
foreground, whereas a pixel that is too black is dominated
by the background. Segmentation and matting differ in how
they treat foreground and background pixels. In hard
segmentation, each pixel is assigned strictly to either
the foreground or the background. In matting, by contrast,
a pixel in the unknown region (α = alpha) is part of both
the foreground and the background.

The extraction of alpha values (alpha matting) is
performed by determining the degree to which each pixel
belongs to the foreground and the background. Alpha
blending is a convex combination of two colours that
allows transparency effects in computer graphics. Alpha
values range from 0.0 to 1.0, where 0.0 is fully
transparent and 1.0 is fully opaque, and the unknown
region is determined by defining a threshold value.
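
The convex combination can be illustrated with a minimal
sketch. The function name `composite` and the example
colours are assumptions made for illustration only; they
do not come from the cited works.

```python
import numpy as np

def composite(alpha, foreground, background):
    """Blend two colours as a convex combination: alpha*F + (1 - alpha)*B."""
    # Alpha is clipped to the valid range 0.0 (transparent) to 1.0 (opaque).
    alpha = np.clip(np.asarray(alpha, dtype=float), 0.0, 1.0)
    f = np.asarray(foreground, dtype=float)
    b = np.asarray(background, dtype=float)
    return alpha * f + (1.0 - alpha) * b

white = [255.0, 255.0, 255.0]  # hypothetical foreground colour
black = [0.0, 0.0, 0.0]        # hypothetical background colour

opaque = composite(1.0, white, black)       # equals the foreground colour
transparent = composite(0.0, white, black)  # equals the background colour
mixed = composite(0.25, white, black)       # a blend, as in the unknown region
```

With alpha at either extreme the pixel is purely foreground
or purely background; any value in between yields a mixed
colour, which is the case that matting has to resolve.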

Levin et al. [2] initially defined the alpha threshold
as 0.17 - 0.15, assuming that the noise in an image lies
within this range. The threshold is set according to the
user's perception of the characteristics of the image
being extracted, so a certain expertise is needed to
choose a threshold that yields a good matte, and a user
error in choosing the threshold degrades the quality of
the matte. To overcome this problem, an adaptive
threshold-based algorithm is proposed that determines the
alpha threshold from the characteristics of the image.

The alpha channel has been computed with a global
adaptive threshold using the Fuzzy C-Means algorithm [3]
and linear optimization [4]. However, changes in
illumination cause some parts of an image to be brighter
and others darker. To overcome this problem, a local
adaptive threshold is applied: the image is divided into
several sub-images, and a threshold value is computed for
each by normalizing the extracted features. The sub-images
are obtained by defining Regions of Interest (ROI) with a
block-based processing model, which also serves as
initialization for determining the foreground and
background areas. Each sub-image is then used as the basis
for analysing and testing the features extracted from the
selected image area. The threshold value is determined by
normalizing the average of the extracted feature values,
which consist of contrast, correlation, energy and
entropy.

II. RELATED WORKS 

Research on alpha matting-based image segmentation
has been carried out in recent decades. J. Wang and M. F.
Cohen apply a trimap [5][6][7][8][9][10], which is a
pre-segmented image that distinguishes the foreground,
background and unknown areas. A limitation of this
approach is the misclassification of colour samples in
complex scenes. Levin et al. [11] define




International Journal of Innovative Technology and Creative Engineering (ISSN:2045-8711) 

VOL.10 NO.7 JULY 2020


user-specified constraints by applying manual scribbles as
sampling sets that define the foreground (F) and
background (B) to overcome this problem, while the unknown
region, or alpha (α), is determined by a specified
threshold value between 0 and 1. Using a trimap to
separate the foreground and background areas gives a
visually near-perfect result [12]. However, choosing the
thickness of the manual strokes when defining scribbles
requires a high level of experience, especially for
complex images such as those of hair, feathers and falling
snow [13].

The determination of threshold values in the range
0-1 for alpha (α) was adapted to Fuzzy C-Means by Basuki
et al. [3]. The threshold value is obtained by calculating
the average maximum value in the class having the smallest
and lowest middle value among the others, where the pixel
values are grouped using a three-class concept in Fuzzy
C-Means. Thresholding in this method is calculated by
collecting into clusters the areas that have the same
level of similarity and proximity to each other,
developing grey-level similarity based on inter-class and
intra-class measures so that the separation of background,
foreground and alpha becomes more distinct. This technique
is then applied repeatedly for semi-automatic segmentation
of video objects [14][15].

Previously, thresholds in images were calculated
globally. P. Case and H. R. Rana use local thresholding
for alpha matting to improve the quality of the threshold
in image segmentation [16]. Image segmentation techniques
are combined to obtain optimal results, including
Edge-Based and Region-Based Segmentation. The
Grey-Histogram Technique and the Gradient-Based Method are
used to define alpha matting in Edge-Based Segmentation,
while the Thresholding method (local and global) and
Region Operating are used in Region-Based Segmentation.
From the experiments, it is concluded that each
segmentation technique in image matting has advantages and
disadvantages that depend on the homogeneity of the
natural image dataset used, the spatial structure, and the
image texture. Therefore, they propose KNN Matting because
it can integrate both.

In general, thresholding is divided into two types:
global and local. The problem with a global threshold is
that illumination varies across the image while the same
value T is applied to every pixel. This causes some parts
to be too bright and others too dark (for example, shadows
of objects in the original image). These problems are
overcome by local thresholding [17], applied adaptively in
several techniques such as Niblack's technique, Sauvola's
technique, Bernsen's technique, Yanowitz and Bruckstein's
method, and Maximum Entropy [18]. A local adaptive
threshold is able to produce optimal values in the
segmentation of an image because the threshold is
calculated on sub-images of the image rather than on the
entire image at once.
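
The difference between the two approaches can be
illustrated with a minimal sketch. It assumes the mean
intensity as the threshold, both for the whole image and
for each block, which is a deliberate simplification and
not any of the techniques cited above. The function names,
the 4 x 4 block grid and the synthetic test image are
illustrative assumptions.

```python
import numpy as np

def global_threshold(gray):
    """Binarize with one threshold T (here the image mean) for every pixel."""
    return gray > gray.mean()

def local_threshold(gray, blocks=4):
    """Binarize each sub-image with its own threshold (here the block mean)."""
    out = np.zeros(gray.shape, dtype=bool)
    rows = np.array_split(np.arange(gray.shape[0]), blocks)
    cols = np.array_split(np.arange(gray.shape[1]), blocks)
    for r in rows:
        for c in cols:
            block = gray[np.ix_(r, c)]
            out[np.ix_(r, c)] = block > block.mean()
    return out

# Synthetic 8x8 image: a checkerboard texture whose right half is
# brightened by 100 grey levels, imitating uneven illumination.
checker = (np.indices((8, 8)).sum(axis=0) % 2).astype(float)
illumination = np.zeros((8, 8))
illumination[:, 4:] = 100.0
gray = checker * 50.0 + illumination

global_mask = global_threshold(gray)  # splits dark half from bright half
local_mask = local_threshold(gray)    # recovers the checkerboard texture
```

In this example the single global value separates only the
dark half from the bright half, whereas the block-wise
thresholds recover the texture on both sides.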

Based on the previous research, the alpha threshold
in this work is calculated as a local adaptive threshold,
where the maximum entropy threshold value obtained from
the Region of Interest (ROI) area is normalized and used
as the alpha threshold. Extracting features from each ROI
sub-image has been shown to be successful for determining
the threshold [19]. The feature extraction applied is the
GLCM (Grey Level Co-Occurrence Matrix), a method based on
statistical analysis of grayscale images. In addition,
GLCM is able to examine textures by considering the
spatial relationship of pixels in an image.

III. RESEARCH FRAMEWORK 

The proposed feature-based object extraction system
is illustrated in Fig. 1. The extraction process began
with image acquisition as the source of the data to be
analysed. Each image in the RGB domain was transformed
into the grayscale domain to simplify the computational
process. Next, the grayscale image was divided into 16
sub-image blocks using the Region of Interest (ROI)
method so that the features could be calculated locally.



Fig. 1. Research framework 

Feature extraction was performed over the sub-images
generated from the ROI, using the parameters contrast,
correlation, energy and entropy at angles of 0°, 45°,
90° and 135° with the GLCM (Grey Level Co-Occurrence
Matrix) method. The feature extraction results were
then normalized and treated as the threshold value (α)
in the image matting. The accuracy of the extracted
object was evaluated using PSNR (Peak Signal-to-Noise
Ratio) by comparing the matte produced by the system
with the ground truth (the reference matte from the
dataset).
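
As an illustration of the normalization step, the
following minimal sketch shows one possible reading: the
four feature values of each sub-image are averaged, and
each average is divided by the sum of the averages over
all sub-images. The function name `normalized_thresholds`,
the input layout and this particular choice of divisor are
assumptions made for illustration.

```python
import numpy as np

def normalized_thresholds(features):
    """Turn per-sub-image GLCM features into one value per sub-image.

    features: array of shape (n_subimages, 4) holding contrast,
    correlation, energy and entropy for each sub-image.
    Assumed normalization: mean feature value of a sub-image divided
    by the sum of those means over all sub-images.
    """
    features = np.asarray(features, dtype=float)
    per_block = features.mean(axis=1)
    total = per_block.sum()
    if total == 0:
        # Degenerate case: no texture information at all.
        return np.zeros_like(per_block)
    return per_block / total

# Two hypothetical sub-images with made-up feature values.
example = normalized_thresholds([[1.0, 1.0, 1.0, 1.0],
                                 [3.0, 3.0, 3.0, 3.0]])
```

Under this reading the values of all sub-images sum to
one. Since correlation can be negative, a practical
implementation may additionally need to clip the result to
the range 0 - 1 before using it as an alpha threshold.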

IV. DISCUSSION 

The matte extraction test was performed using a
public matting dataset consisting of teddy.bmp, kid.bmp,
teddy_ear.bmp, fire.bmp and hair.bmp (shown in Fig. 2),
in the following stages:

• Image transformation: the image was converted from the
RGB to the grayscale domain to simplify the
computational process. An image with 1 (one) channel is
simpler to process than one with 3 (three) channels,
each with a value range of 0 - 255. The conversion
followed Equation (1).

$$GrayImage = 0.2989R + 0.5870G + 0.1140B \tag{1}$$

in which R, G and B are the intensity values of the
red, green and blue channels. In addition, the
computation was simplified for processing-time
efficiency by applying a threshold so that the
intensity values became binary (0 and 1) and could be
used in the extraction process.
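
The conversion in Equation (1) can be sketched as
follows, assuming the image is held as an array whose last
axis contains the R, G and B values. The function name
`rgb_to_gray` is hypothetical.

```python
import numpy as np

# Channel weights for R, G and B as given in Equation (1).
GRAY_WEIGHTS = np.array([0.2989, 0.5870, 0.1140])

def rgb_to_gray(rgb):
    """Weighted sum of the R, G and B channels along the last axis."""
    rgb = np.asarray(rgb, dtype=float)
    return rgb @ GRAY_WEIGHTS

pure_red = rgb_to_gray([255, 0, 0])          # a single RGB pixel
blank = rgb_to_gray(np.zeros((2, 3, 3)))     # a 2x3 RGB image
```

The result has one channel less than the input, so a
height x width x 3 image becomes a height x width array.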

• Region of Interest (ROI): A region of interest is a
part of an image identified for a particular purpose. In
this research, the ROI was determined using the block
processing method [20], in which the image converted to
the grayscale domain was divided into 16 blocks of the
same size, as shown in Fig. 3. Each ROI was then treated
as a sub-image, and each block was labelled as
initialization to distinguish the foreground and
background regions.

Fig. 2. Image matting dataset: (a) to (e) show the original images and (f) to (j) show the reference images

• Feature extraction: GLCM (Gray Level Co-Occurrence
Matrix) was used to extract features based on
second-order texture statistics, which describe the
relationship between pairs of pixels in the original
image. The sub-images were used as the data source. The
GLCM features analyzed were contrast, correlation,
energy and entropy. Feature extraction was performed by
taking a window of 5x5 pixels from the upper left corner
of each sub-image, as shown in Fig. 3, and calculating
the value of each feature as follows:

1) Contrast

Contrast was a measure of the variation in the gray
levels of the image pixels, calculated using
Equation (2).

$$Contrast = \sum_{n=1}^{L} n^{2} \left\{ \sum_{|i-j|=n} GLCM(i,j) \right\} \tag{2}$$

in which L was the number of gray levels used in the
computation, n was the gray-level difference |i - j|,
and i and j were the intensities of the two pixels in
a pair.

2) Correlation

Correlation was a measure of the linear dependency
between gray levels in an image, calculated using
Equation (3).

$$Correlation = \frac{\sum_{i=1}^{L} \sum_{j=1}^{L} (i\,j)\,GLCM(i,j) - \mu_{i}\mu_{j}}{\sigma_{i}\sigma_{j}} \tag{3}$$

in which L was the number of gray levels used in the
computation, i and j were the intensities of the two
pixels in a pair, μ_i and μ_j were the means, and σ_i
and σ_j were the standard deviations of the pixel
intensities in the GLCM matrix.

3) Energy

Energy was a measure of the intensity variation within
a region, calculated using Equation (4).

$$Energy = \sum_{i,j=0}^{G-1} \left[ GLCM(i,j) \right]^{2} \tag{4}$$

in which G was the number of gray levels and i and j
were the pixel intensities indexing the GLCM matrix.

4) Entropy

Entropy was used to express the degree of gray-level
irregularity in the image, calculated by applying
Equation (5).

$$Entropy = -\sum_{i=1}^{L} \sum_{j=1}^{L} GLCM(i,j) \log\left( GLCM(i,j) \right) \tag{5}$$

The feature values of contrast, correlation, energy
and entropy were determined at rotation angles of 0°,
45°, 90° and 135° for every window in each sub-image.
The average values of the individual features were then
summed and divided by the number of features analyzed.
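
The block division and feature extraction steps can be
sketched as follows. This is a minimal illustration under
several assumptions that are not stated in the text: gray
values are quantized to 8 levels, the co-occurrence matrix
is made symmetric and normalized to probabilities,
contrast is computed in the equivalent form of a sum of
(i - j)² weighted by the matrix entries, correlation uses
the mean-centred form, and a constant window is given a
correlation of 1. All function names are hypothetical, and
each sub-image is assumed to be at least 5x5 pixels.

```python
import numpy as np

# (row, column) steps to the neighbouring pixel for each angle in degrees.
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}

def glcm(window, angle, levels=8):
    """Symmetric, normalized co-occurrence matrix of a gray-level window."""
    q = (np.asarray(window, dtype=float) / 256.0 * levels).astype(int)
    q = np.clip(q, 0, levels - 1)
    dr, dc = OFFSETS[angle]
    rows, cols = q.shape
    m = np.zeros((levels, levels))
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                m[q[r, c], q[r2, c2]] += 1
    m = m + m.T          # count each pair in both directions
    return m / m.sum()   # convert counts to probabilities

def glcm_features(p):
    """Return [contrast, correlation, energy, entropy] of a normalized GLCM."""
    i, j = np.indices(p.shape)
    mu_i, mu_j = (i * p).sum(), (j * p).sum()
    sd_i = np.sqrt((((i - mu_i) ** 2) * p).sum())
    sd_j = np.sqrt((((j - mu_j) ** 2) * p).sum())
    contrast = (((i - j) ** 2) * p).sum()
    if sd_i > 0 and sd_j > 0:
        correlation = ((i - mu_i) * (j - mu_j) * p).sum() / (sd_i * sd_j)
    else:
        correlation = 1.0  # constant window: convention assumed here
    energy = (p ** 2).sum()
    nonzero = p[p > 0]     # skip empty cells to avoid log(0)
    entropy = -(nonzero * np.log(nonzero)).sum()
    return np.array([contrast, correlation, energy, entropy])

def block_features(gray, blocks=4, win=5, levels=8):
    """Features of the upper-left window of each block, averaged over angles."""
    h, w = gray.shape
    bh, bw = h // blocks, w // blocks
    out = []
    for br in range(blocks):
        for bc in range(blocks):
            window = gray[br * bh: br * bh + win, bc * bw: bc * bw + win]
            feats = [glcm_features(glcm(window, a, levels)) for a in OFFSETS]
            out.append(np.mean(feats, axis=0))
    return np.array(out)  # shape (blocks * blocks, 4)
```

For a grayscale array `gray`, `block_features(gray)`
returns one row per sub-image, 16 rows for the 4 x 4 grid
used here, with the four features averaged over the four
angles.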

Fig. 3. Image transformation and ROI determination

• Alpha matting: A good matte extraction result should
have an even colour distribution, neither too white nor
too black. A pixel that is too white is dominated by the
extracted foreground, whereas a pixel that is too black
is dominated by the background. Segmentation and matting
differ in how they treat foreground and background
pixels. In hard segmentation, each pixel is assigned
strictly to either the foreground or the background. In
matting, a pixel in the unknown region (α = alpha) is
part of both the foreground and the background, so the
threshold value in this area decides the quality of the
separation result. The object separation process was
performed under the assumption that predominantly white
pixels (α = 1) correlate with the foreground and black
pixels (α = 0) with the background. Accuracy becomes a
problem in the unknown region, which is located at the
edge of an object. Image matting operations were
performed by specifying a threshold value between 0 and
1 to define the unknown region. The threshold was
previously determined by Levin and Lischinski, who
defined the alpha threshold between 0.17 - 0.15,
assuming that the noise in an image lies within this
range. The threshold was set according to the user's
perception of the characteristics of the image being
extracted, so users need a certain expertise to choose a
threshold that gives the best result, and a user error
in choosing the threshold affects the quality of the
matte [2]. To overcome this problem, Basuki et al.
propose an adaptive threshold-based algorithm that
applies Fuzzy C-Means to determine threshold values from
the image characteristics, which are then used as
thresholds in the unknown region [3][21]. In this
research, the threshold value was calculated in each
sub-image resulting from the local determination of the
ROI. The average value of the features extracted with
GLCM (contrast, correlation, energy and entropy) in each
sub-image (Equations 2 - 5) was normalized by dividing
it by the sum of the average feature values of the
image, as shown in Equation (6).

$$Threshold = \frac{Average\left( Contrast_{s} + Correlation_{s} + Energy_{s} + Entropy_{s} \right)}{\sum Average_{image\ feature}} \tag{6}$$

in which the subscript s denoted a feature value on the
sub-image. The result of Equation (6) was treated as the
threshold value for alpha matting [11] and used as input
for Equation (7).

• Experiment result and evaluation: The image matting
dataset was tested using original, scribble and
reference images as input, the reference images serving
as the basis for comparing the proposed method with the
intended results. Each image (teddy.bmp, teddy_ear.bmp,
kid.bmp, fire.bmp and hair.bmp) was converted into a
gray image and divided into 16 sub-image blocks. From
each sub-image a 5x5 pixel window was taken and
processed with GLCM to obtain the feature values, which
were normalized into the threshold value used as the
input alpha value (α) in image matting, as shown in
Equation (7).

$$I_{i} = \alpha_{i} F_{i} + (1 - \alpha_{i}) B_{i} \tag{7}$$

in which I_i was the observed value of pixel i, F_i was
the definitive foreground value, B_i was the definitive
background value and α_i was the alpha value in the
unknown region.

Then, the matte was extracted using closed-form matting
[3][11][14][15]. The resulting matte was combined with
the original image to obtain the desired object, as
illustrated in Fig. 4.

Fig. 4. Extraction stages: (a) original image, (b) scribble image, (c) reference matte from the dataset, (d) matte extracted by the
proposed approach, and (e) object extracted by the proposed approach

The trials conducted on each of the analyzed images
showed that the proposed method gave a significant
increase in performance compared to the closed-form
solution with FCM as the alpha threshold [3],[14],[15].
Each image was evaluated using PSNR (Peak
Signal-to-Noise Ratio) as in Equation (8), and the
results are shown in Fig. 5.

$$PSNR = 10 \log_{10} \left( \frac{255^{2}}{MSE} \right) \tag{8}$$

in which

$$MSE = \frac{1}{M \times N} \sum_{i=1}^{M} \sum_{j=1}^{N} \left( grdImg(i,j) - matteExt(i,j) \right)^{2}$$

where grdImg was the ground truth image used as the
reference image, matteExt was the matte extracted by the
system and M×N was the size of the evaluated image.

Fig. 5. Performance evaluation using PSNR
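
The PSNR evaluation can be sketched as follows, assuming
8-bit mattes with a peak value of 255 and two arrays of
equal size. The function names `mse` and `psnr` are
hypothetical, and returning infinity for identical images
is a convention chosen here.

```python
import numpy as np

def mse(ground_truth, matte):
    """Mean squared error between the reference matte and the extracted matte."""
    g = np.asarray(ground_truth, dtype=float)
    m = np.asarray(matte, dtype=float)
    return np.mean((g - m) ** 2)

def psnr(ground_truth, matte, peak=255.0):
    """Peak signal-to-noise ratio in decibels; higher means a closer match."""
    err = mse(ground_truth, matte)
    if err == 0:
        return float("inf")  # identical images: no noise to measure
    return 10.0 * np.log10(peak ** 2 / err)

reference = np.zeros((4, 4))
off_by_one = reference + 1.0                 # every pixel differs by one level
score = psnr(reference, off_by_one)          # equals 10 * log10(255**2)
```

A larger PSNR indicates that the extracted matte is closer
to the reference matte, which is how the comparison in
Fig. 5 should be read.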


V. CONCLUSION 

Alpha matting-based object extraction was tested on
5 images from the image matting dataset. Each image was
first transformed into the grayscale domain and divided
into 16 sub-images using ROI. The features of each
sub-image were extracted using GLCM with the contrast,
correlation, energy and entropy parameters, and the
results were then normalized and treated as (local)
threshold values in alpha matting.

The trial results showed that the object extraction
process with a feature-based threshold for natural images
worked well in extracting the matte. Visually, the matte
corresponded more accurately to the foreground pixels. In
addition, the increase in accuracy was shown
quantitatively by the evaluation using Peak
Signal-to-Noise Ratio (PSNR), which showed an average
increase of up to 63% over the previous methods
[3][14][15][21].